Process a lot of information quickly
distributions, relationships, outliers
summary statistics (e.g. correlations) do not tell the whole story
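A classic illustration of the last point is Anscombe's quartet, which ships with R as the `anscombe` data frame: four x/y pairs whose means and correlations are nearly identical but which look very different once plotted. A minimal base R sketch (the helper names `xs`, `ys`, `y_means` and `cors` are illustrative):

```r
# Anscombe's quartet: four x/y pairs built into R (datasets::anscombe)
xs <- paste0("x", 1:4)
ys <- paste0("y", 1:4)

# Nearly identical summary statistics for all four pairs
y_means <- sapply(ys, function(v) mean(anscombe[[v]]))
cors <- mapply(function(x, y) cor(anscombe[[x]], anscombe[[y]]), xs, ys)
round(y_means, 2)
round(cors, 3)

# ... but four very different pictures
op <- par(mfrow = c(2, 2))
for (i in 1:4) {
  plot(anscombe[[xs[i]]], anscombe[[ys[i]]],
       xlab = xs[i], ylab = ys[i], main = paste("Set", i))
}
par(op) # restore the previous layout
```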
Base R vs. ggplot2: different philosophies
base R plots
one type of plot (e.g. scatter, box, histogram)
plot arguments for appearance (e.g. labels, colors, sizes, etc.)
add things later (e.g. points, lines) with low-level plot functions
ggplot
add layers to a plot
change appearance of the plot in any of these layers
High-level plot functions create a plot (… are additional arguments)
plot(x, y = NULL, ...)
hist(x, ...)
boxplot(formula, data, ...)
barplot(formula, data, ...)
Low-level plot functions add something to the plot
points(x, y)
lines(x, y)
abline(a, b, h, v)
text(x, y, labels)
legend(x, y, legend)
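How the two kinds of function work together can be sketched as follows: one high-level call draws the plot, and each low-level call then adds to it. The regression fit, the reference line and the `disp > 400` cutoff are illustrative choices, not part of the original slides.

```r
# High-level call: creates the plot
plot(mtcars$disp, mtcars$mpg, xlab = "disp", ylab = "mpg")

# Low-level calls: each one adds to the existing plot
fit <- lm(mpg ~ disp, data = mtcars)
abline(reg = fit, lty = 2)                  # fitted regression line
abline(h = mean(mtcars$mpg), col = "grey")  # horizontal reference line

heavy <- mtcars[mtcars$disp > 400, ]        # cars with the largest engines
text(heavy$disp, heavy$mpg, labels = rownames(heavy), pos = 2, cex = .7)

legend("topright", legend = c("lm fit", "mean mpg"),
       lty = c(2, 1), col = c("black", "grey"))
```

Calling a low-level function without an open plot is an error, so the order of the calls matters.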
par(mfrow = c(2, 3))
with(mtcars, plot(mpg)) # or plot(mtcars$mpg)
with(mtcars, boxplot(mpg))
with(mtcars, plot(factor(gear)))
with(mtcars, plot(disp, mpg))
with(mtcars, boxplot(mpg ~ gear))
with(mtcars, barplot(xtabs(~ gear + cyl))) # no legend!
# add colors; as.integer(factor(am)) gives 1/2 whether am is numeric or a factor
with(mtcars, plot(disp, mpg, col = c("red", "blue")[as.integer(factor(am))]))
legend("topright", legend = c("automatic", "manual"), # add legend
       col = c("red", "blue"), pch = 1, title = "transmission")
ggplot2?
Layered plotting based on the book The Grammar of Graphics by Leland Wilkinson.
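library(ggplot2)
library(patchwork) # combine plots in one figure

ggplot(data = mtcars) +                          # data only: empty canvas
  ggplot(mtcars, aes(x = disp, y = mpg)) +       # aesthetics mapped: axes, no geom yet
  ggplot(mtcars, aes(disp, mpg)) + geom_point()  # geom added: points appear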
ggplot object
It is possible to save a plot object and add layers later.
gg <- ggplot(mtcars, aes(disp, mpg)) + geom_point()
gg
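What "adding layers later" means can be checked directly: adding a layer returns a new plot object and leaves the saved one unchanged. A small sketch, assuming ggplot2 is installed (the name `gg_lm` is illustrative):

```r
library(ggplot2)

gg <- ggplot(mtcars, aes(disp, mpg)) + geom_point()

# Adding a layer returns a new object; gg itself keeps its single layer
gg_lm <- gg + geom_smooth(method = "lm", se = FALSE)

length(gg$layers)    # layers in the saved plot
length(gg_lm$layers) # layers after adding the smoother
```

This is why the same `gg` can be reused as the starting point for several different plots.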
Color by am (transmission) and lose the confidence band
gg + geom_smooth() +
  gg + geom_smooth(se = FALSE) + geom_point(aes(col = am)) +
  gg + geom_smooth(aes(col = am), method = "lm", se = FALSE) + geom_point(aes(col = am)) +
  plot_layout(guides = "collect", axes = "collect") # collect labels/legends
gg +
  geom_smooth(aes(col = am), method = "lm", se = FALSE) +
  geom_point(aes(col = am)) +
  labs(x = "displacement", y = "miles per gallon", title = "Gas consumption") +
  theme_minimal()
gg + geom_point(aes(col = am, size = hp)) + theme_minimal()
geom_point
geom_bar
geom_line
geom_smooth
geom_histogram
geom_boxplot
geom_violin
geom_density
and many more…
ggplot2 cheat sheet

ggplot(mtcars, aes(mpg)) + geom_histogram(bins = 10) +
  ggplot(mtcars, aes(mpg)) + geom_density() +
  ggplot(mtcars, aes(mpg)) + geom_boxplot()

ggplot(mtcars) + geom_bar(aes(gear)) +
  ggplot(mtcars) + geom_bar(aes(gear, fill = vs)) +
  ggplot(mtcars) + geom_bar(aes(gear, fill = vs), position = "dodge") +
  plot_layout(guides = "collect", axes = "collect")

ggplot(mtcars, aes(mpg, am)) + geom_boxplot() +
  ggplot(mtcars, aes(mpg, fill = am)) + geom_density(alpha = .3) +
  ggplot(mtcars, aes(mpg, disp)) + geom_density_2d()
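Mappings such as `fill = am` or `fill = vs` only split the data into groups when the variable is categorical. In the built-in `mtcars` data these columns are numeric 0/1 codes, so the sketch below assumes they were converted to factors beforehand; the data frame name `mtcars_f` and the level labels (taken from the `mtcars` help page) are choices made here, not something the slides show.

```r
library(ggplot2)

# Assumption: recode the 0/1 columns as labelled factors before plotting
mtcars_f <- within(mtcars, {
  am <- factor(am, levels = c(0, 1), labels = c("automatic", "manual"))
  vs <- factor(vs, levels = c(0, 1), labels = c("V-shaped", "straight"))
})

# One density curve per transmission type
p_density <- ggplot(mtcars_f, aes(mpg, fill = am)) + geom_density(alpha = .3)
p_density
```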
Splitting the plot by levels of a categorical variable
gg + geom_smooth(method = "lm", se = FALSE) + facet_grid(cols = vars(gear))
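Faceting also works with two variables, or with panels wrapped into a fixed number of rows. A short sketch of both variants, assuming ggplot2 is installed; the choice of `am`, `gear` and `cyl` as facet variables is only for illustration.

```r
library(ggplot2)

gg <- ggplot(mtcars, aes(disp, mpg)) + geom_point()

# Grid: one row per level of am, one column per level of gear
p_grid <- gg + facet_grid(rows = vars(am), cols = vars(gear))

# Wrap: one panel per level of cyl, laid out in a single row
p_wrap <- gg + facet_wrap(vars(cyl), nrow = 1)

p_grid
p_wrap
```

`facet_grid()` shows every combination of the two variables, including empty ones, while `facet_wrap()` only draws panels for levels that occur in the data.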
R Markdown has chunk options to resize plots. This is especially useful when more than one plot is displayed in a single plot array.
fig.width = 7 is the default width of the array
fig.height = 5 is the default height of the array
fig.asp = .7 is roughly the default height:width ratio (5/7)
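These options can also be set once for all chunks instead of chunk by chunk. A minimal sketch, assuming the knitr package is installed and that the call is placed in a setup chunk near the top of the document; the values are illustrative.

```r
# Assumes knitr is installed; typically placed in a setup chunk.
# The values below are illustrative, not recommendations.
knitr::opts_chunk$set(
  fig.width = 6,   # width in inches
  fig.asp   = 0.6  # height is computed as fig.asp * fig.width
)
```

Options given in an individual chunk header still override these document-wide settings.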
The defaults can be changed in the chunk header, for example {r fig.asp = .4} to reduce the height.